Overview

Dataset statistics

Number of variables: 8
Number of observations: 768
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 48.1 KiB
Average record size in memory: 64.2 B

Variable types

Numeric: 7
Categorical: 1

Alerts

SkinThickness is highly overall correlated with Insulin [High correlation]
Insulin is highly overall correlated with SkinThickness [High correlation]
BloodPressure has 35 (4.6%) zeros [Zeros]
SkinThickness has 227 (29.6%) zeros [Zeros]
Insulin has 374 (48.7%) zeros [Zeros]
BMI has 11 (1.4%) zeros [Zeros]
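
The zero percentages in these alerts follow directly from the zero counts and the 768 observations listed under Dataset statistics. A minimal Python sketch of that arithmetic (the dictionary and function names are illustrative and not part of the report):

```python
# Zero counts per column, copied from the alerts; 768 observations per the overview.
N_OBSERVATIONS = 768
ZERO_COUNTS = {
    "BloodPressure": 35,
    "SkinThickness": 227,
    "Insulin": 374,
    "BMI": 11,
}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Return the share of zeros as a percentage string with one decimal."""
    return f"{count / total:.1%}"


zero_shares = {name: zero_share(count) for name, count in ZERO_COUNTS.items()}

for name, share in zero_shares.items():
    print(f"{name}: {share}")
```

Each computed share matches the percentage shown in the corresponding alert.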

Reproduction

Analysis started: 2023-12-14 05:47:26.543411
Analysis finished: 2023-12-14 05:47:43.720862
Duration: 17.18 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
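
A report of this kind can be regenerated with the `ProfileReport` class of pandas-profiling. The sketch below is an assumption about how that might look, not a record of how this report was produced: the CSV file name, output path, and title are hypothetical, and the settings in config.json are not reproduced.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from pandas_profiling import ProfileReport  # pandas-profiling v3.x

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")  # title is illustrative
    profile.to_file(out_path)


# Example call (hypothetical file name):
# build_report("diabetes.csv")
```

Timings and memory figures will differ between runs and machines, so only the statistics themselves should be expected to match.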

Variables

Glucose
Real number (ℝ)

Distinct: 136
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120.89453
Minimum: 0
Maximum: 199
Zeros: 5
Zeros (%): 0.7%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 79
Q1: 99
Median: 117
Q3: 140.25
95th percentile: 181
Maximum: 199
Range: 199
Interquartile range (IQR): 41.25

Descriptive statistics

Standard deviation: 31.972618
Coefficient of variation (CV): 0.26446703
Kurtosis: 0.64077982
Mean: 120.89453
Median Absolute Deviation (MAD): 20
Skewness: 0.1737535
Sum: 92847
Variance: 1022.2483
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
99                  17     2.2%
100                 17     2.2%
111                 14     1.8%
129                 14     1.8%
125                 14     1.8%
106                 14     1.8%
112                 13     1.7%
108                 13     1.7%
95                  13     1.7%
105                 13     1.7%
Other values (126)  626    81.5%

Value  Count  Frequency (%)
0      5      0.7%
44     1      0.1%
56     1      0.1%
57     2      0.3%
61     1      0.1%
62     1      0.1%
65     1      0.1%
67     1      0.1%
68     3      0.4%
71     4      0.5%

Value  Count  Frequency (%)
199    1      0.1%
198    1      0.1%
197    4      0.5%
196    3      0.4%
195    2      0.3%
194    3      0.4%
193    2      0.3%
191    1      0.1%
190    1      0.1%
189    4      0.5%
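
Several of the Glucose figures are derived from others, so they can be cross-checked against each other. The sketch below recomputes them from the reported values, assuming the usual definitions (range is maximum minus minimum, IQR is Q3 minus Q1, CV is standard deviation divided by mean, variance is standard deviation squared, mean is sum divided by the number of observations). The variable names are illustrative.

```python
import math

# Values copied from the Glucose section of the report.
n = 768
total = 92847
mean = 120.89453
std = 31.972618
q1, q3 = 99, 140.25
minimum, maximum = 0, 199

derived = {
    "mean": total / n,
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
}

reported = {
    "mean": 120.89453,
    "range": 199,
    "iqr": 41.25,
    "cv": 0.26446703,
    "variance": 1022.2483,
}

for name, value in derived.items():
    # Loose tolerance: the report rounds to roughly eight significant digits.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.6f} reported={reported[name]} match={ok}")
```

The same check applies to the other numeric variables by substituting their reported values.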

BloodPressure
Real number (ℝ)

Distinct: 47
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.105469
Minimum: 0
Maximum: 122
Zeros: 35
Zeros (%): 4.6%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 38.7
Q1: 62
Median: 72
Q3: 80
95th percentile: 90
Maximum: 122
Range: 122
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 19.355807
Coefficient of variation (CV): 0.28009082
Kurtosis: 5.1801566
Mean: 69.105469
Median Absolute Deviation (MAD): 8
Skewness: -1.843608
Sum: 53073
Variance: 374.64727
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=47)

Value              Count  Frequency (%)
70                 57     7.4%
74                 52     6.8%
78                 45     5.9%
68                 45     5.9%
72                 44     5.7%
64                 43     5.6%
80                 40     5.2%
76                 39     5.1%
60                 37     4.8%
0                  35     4.6%
Other values (37)  331    43.1%

Value  Count  Frequency (%)
0      35     4.6%
24     1      0.1%
30     2      0.3%
38     1      0.1%
40     1      0.1%
44     4      0.5%
46     2      0.3%
48     5      0.7%
50     13     1.7%
52     11     1.4%

Value  Count  Frequency (%)
122    1      0.1%
114    1      0.1%
110    3      0.4%
108    2      0.3%
106    3      0.4%
104    2      0.3%
102    1      0.1%
100    3      0.4%
98     3      0.4%
96     4      0.5%
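
The BloodPressure histogram uses 47 bins, while every other numeric variable in this report uses 50. That pattern is consistent with the bin count being capped at the number of distinct values, i.e. min(50, distinct). The rule is inferred from the numbers in this report and is an assumption, not something the report states. A quick check in Python:

```python
# (distinct values, histogram bins) as reported for each numeric variable.
REPORTED = {
    "Glucose": (136, 50),
    "BloodPressure": (47, 47),
    "SkinThickness": (51, 50),
    "Insulin": (186, 50),
    "BMI": (248, 50),
    "DiabetesPedigreeFunction": (517, 50),
    "Age": (52, 50),
}

MAX_BINS = 50  # assumed cap, inferred from the report


def expected_bins(distinct: int, max_bins: int = MAX_BINS) -> int:
    """Assumed rule: never use more bins than there are distinct values."""
    return min(max_bins, distinct)


mismatches = [
    name
    for name, (distinct, bins) in REPORTED.items()
    if expected_bins(distinct) != bins
]
print(mismatches)
```

An empty list means the assumed rule reproduces every bin count in the report.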

SkinThickness
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 51
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.536458
Minimum: 0
Maximum: 99
Zeros: 227
Zeros (%): 29.6%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 23
Q3: 32
95th percentile: 44
Maximum: 99
Range: 99
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 15.952218
Coefficient of variation (CV): 0.77677549
Kurtosis: -0.52007187
Mean: 20.536458
Median Absolute Deviation (MAD): 12
Skewness: 0.1093725
Sum: 15772
Variance: 254.47325
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  227    29.6%
32                 31     4.0%
30                 27     3.5%
27                 23     3.0%
23                 22     2.9%
33                 20     2.6%
28                 20     2.6%
18                 20     2.6%
31                 19     2.5%
19                 18     2.3%
Other values (41)  341    44.4%

Value  Count  Frequency (%)
0      227    29.6%
7      2      0.3%
8      2      0.3%
10     5      0.7%
11     6      0.8%
12     7      0.9%
13     11     1.4%
14     6      0.8%
15     14     1.8%
16     6      0.8%

Value  Count  Frequency (%)
99     1      0.1%
63     1      0.1%
60     1      0.1%
56     1      0.1%
54     2      0.3%
52     2      0.3%
51     1      0.1%
50     3      0.4%
49     3      0.4%
48     4      0.5%

Insulin
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 186
Distinct (%): 24.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.799479
Minimum: 0
Maximum: 846
Zeros: 374
Zeros (%): 48.7%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 30.5
Q3: 127.25
95th percentile: 293
Maximum: 846
Range: 846
Interquartile range (IQR): 127.25

Descriptive statistics

Standard deviation: 115.244
Coefficient of variation (CV): 1.4441699
Kurtosis: 7.2142596
Mean: 79.799479
Median Absolute Deviation (MAD): 30.5
Skewness: 2.2722509
Sum: 61286
Variance: 13281.18
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0                   374    48.7%
105                 11     1.4%
130                 9      1.2%
140                 9      1.2%
120                 8      1.0%
94                  7      0.9%
180                 7      0.9%
100                 7      0.9%
135                 6      0.8%
115                 6      0.8%
Other values (176)  324    42.2%

Value  Count  Frequency (%)
0      374    48.7%
14     1      0.1%
15     1      0.1%
16     1      0.1%
18     2      0.3%
22     1      0.1%
23     2      0.3%
25     1      0.1%
29     1      0.1%
32     1      0.1%

Value  Count  Frequency (%)
846    1      0.1%
744    1      0.1%
680    1      0.1%
600    1      0.1%
579    1      0.1%
545    1      0.1%
543    1      0.1%
540    1      0.1%
510    1      0.1%
495    2      0.3%

BMI
Real number (ℝ)

Distinct: 248
Distinct (%): 32.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.992578
Minimum: 0
Maximum: 67.1
Zeros: 11
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 21.8
Q1: 27.3
Median: 32
Q3: 36.6
95th percentile: 44.395
Maximum: 67.1
Range: 67.1
Interquartile range (IQR): 9.3

Descriptive statistics

Standard deviation: 7.8841603
Coefficient of variation (CV): 0.24643717
Kurtosis: 3.2904429
Mean: 31.992578
Median Absolute Deviation (MAD): 4.6
Skewness: -0.42898159
Sum: 24570.3
Variance: 62.159984
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
32                  13     1.7%
31.6                12     1.6%
31.2                12     1.6%
0                   11     1.4%
32.4                10     1.3%
33.3                10     1.3%
30.1                9      1.2%
32.8                9      1.2%
32.9                9      1.2%
30.8                9      1.2%
Other values (238)  664    86.5%

Value  Count  Frequency (%)
0      11     1.4%
18.2   3      0.4%
18.4   1      0.1%
19.1   1      0.1%
19.3   1      0.1%
19.4   1      0.1%
19.5   2      0.3%
19.6   3      0.4%
19.9   1      0.1%
20     1      0.1%

Value  Count  Frequency (%)
67.1   1      0.1%
59.4   1      0.1%
57.3   1      0.1%
55     1      0.1%
53.2   1      0.1%
52.9   1      0.1%
52.3   2      0.3%
50     1      0.1%
49.7   1      0.1%
49.6   1      0.1%

DiabetesPedigreeFunction
Real number (ℝ)

Distinct: 517
Distinct (%): 67.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4718763
Minimum: 0.078
Maximum: 2.42
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0.078
5th percentile: 0.14035
Q1: 0.24375
Median: 0.3725
Q3: 0.62625
95th percentile: 1.13285
Maximum: 2.42
Range: 2.342
Interquartile range (IQR): 0.3825

Descriptive statistics

Standard deviation: 0.3313286
Coefficient of variation (CV): 0.70215138
Kurtosis: 5.5949535
Mean: 0.4718763
Median Absolute Deviation (MAD): 0.1675
Skewness: 1.9199111
Sum: 362.401
Variance: 0.10977864
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0.258               6      0.8%
0.254               6      0.8%
0.268               5      0.7%
0.207               5      0.7%
0.261               5      0.7%
0.259               5      0.7%
0.238               5      0.7%
0.19                4      0.5%
0.263               4      0.5%
0.299               4      0.5%
Other values (507)  719    93.6%

Value  Count  Frequency (%)
0.078  1      0.1%
0.084  1      0.1%
0.085  2      0.3%
0.088  2      0.3%
0.089  1      0.1%
0.092  1      0.1%
0.096  1      0.1%
0.1    1      0.1%
0.101  1      0.1%
0.102  1      0.1%

Value  Count  Frequency (%)
2.42   1      0.1%
2.329  1      0.1%
2.288  1      0.1%
2.137  1      0.1%
1.893  1      0.1%
1.781  1      0.1%
1.731  1      0.1%
1.699  1      0.1%
1.698  1      0.1%
1.6    1      0.1%

Age
Real number (ℝ)

Distinct: 52
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.240885
Minimum: 21
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 21
5th percentile: 21
Q1: 24
Median: 29
Q3: 41
95th percentile: 58
Maximum: 81
Range: 60
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 11.760232
Coefficient of variation (CV): 0.35378816
Kurtosis: 0.64315889
Mean: 33.240885
Median Absolute Deviation (MAD): 7
Skewness: 1.1295967
Sum: 25529
Variance: 138.30305
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
22                 72     9.4%
21                 63     8.2%
25                 48     6.2%
24                 46     6.0%
23                 38     4.9%
28                 35     4.6%
26                 33     4.3%
27                 32     4.2%
29                 29     3.8%
31                 24     3.1%
Other values (42)  348    45.3%

Value  Count  Frequency (%)
21     63     8.2%
22     72     9.4%
23     38     4.9%
24     46     6.0%
25     48     6.2%
26     33     4.3%
27     32     4.2%
28     35     4.6%
29     29     3.8%
30     21     2.7%

Value  Count  Frequency (%)
81     1      0.1%
72     1      0.1%
70     1      0.1%
69     2      0.3%
68     1      0.1%
67     3      0.4%
66     4      0.5%
65     3      0.4%
64     1      0.1%
63     4      0.5%

Outcome
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 43.6 KiB

Value  Count
0      500
1      268

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 768
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring characters

Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  768    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring scripts

Value   Count  Frequency (%)
Common  768    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  768    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      500    65.1%
1      268    34.9%

Interactions

[Figure: 49 interaction plots, one per ordered pair of the 7 numeric variables]

Correlations

Variable                  Glucose  BloodPressure  SkinThickness  Insulin  BMI     DiabetesPedigreeFunction  Age     Outcome
Glucose                   1.000    0.235          0.060          0.213    0.231   0.091                     0.285   0.487
BloodPressure             0.235    1.000          0.126          -0.007   0.293   0.030                     0.351   0.152
SkinThickness             0.060    0.126          1.000          0.541    0.444   0.180                     -0.067  0.208
Insulin                   0.213    -0.007         0.541          1.000    0.193   0.221                     -0.114  0.159
BMI                       0.231    0.293          0.444          0.193    1.000   0.141                     0.131   0.317
DiabetesPedigreeFunction  0.091    0.030          0.180          0.221    0.141   1.000                     0.043   0.173
Age                       0.285    0.351          -0.067         -0.114   0.131   0.043                     1.000   0.314
Outcome                   0.487    0.152          0.208          0.159    0.317   0.173                     0.314   1.000
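
The only pair flagged in the Alerts section, SkinThickness and Insulin, is also the largest off-diagonal entry in this matrix. The sketch below lists every pair whose absolute correlation meets a threshold. The 0.5 threshold is an assumption chosen for illustration, not a setting read from the report's configuration, and the function name is hypothetical.

```python
from itertools import combinations

NAMES = [
    "Glucose", "BloodPressure", "SkinThickness", "Insulin",
    "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]

# Correlation matrix copied from the report; rows follow the order of NAMES.
CORR = [
    [1.000, 0.235, 0.060, 0.213, 0.231, 0.091, 0.285, 0.487],
    [0.235, 1.000, 0.126, -0.007, 0.293, 0.030, 0.351, 0.152],
    [0.060, 0.126, 1.000, 0.541, 0.444, 0.180, -0.067, 0.208],
    [0.213, -0.007, 0.541, 1.000, 0.193, 0.221, -0.114, 0.159],
    [0.231, 0.293, 0.444, 0.193, 1.000, 0.141, 0.131, 0.317],
    [0.091, 0.030, 0.180, 0.221, 0.141, 1.000, 0.043, 0.173],
    [0.285, 0.351, -0.067, -0.114, 0.131, 0.043, 1.000, 0.314],
    [0.487, 0.152, 0.208, 0.159, 0.317, 0.173, 0.314, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for "high" correlation


def high_pairs(names, corr, threshold):
    """Return (name_a, name_b, r) for each unordered pair with |r| >= threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(corr[i][j]) >= threshold:
            pairs.append((names[i], names[j], corr[i][j]))
    return pairs


pairs = high_pairs(NAMES, CORR, THRESHOLD)
print(pairs)
```

With this threshold the Glucose and Outcome pair (0.487) falls just short, which matches the absence of an alert for it.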

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

Row  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabetesPedigreeFunction  Age  Outcome
0    148      72             35             0        33.6  0.63                      50   1
1    85       66             29             0        26.6  0.35                      31   0
2    183      64             0              0        23.3  0.67                      32   1
3    89       66             23             94       28.1  0.17                      21   0
4    137      40             35             168      43.1  2.29                      33   1
5    116      74             0              0        25.6  0.20                      30   0
6    78       50             32             88       31.0  0.25                      26   1
7    115      0              0              0        35.3  0.13                      29   0
8    197      70             45             543      30.5  0.16                      53   1
9    125      96             0              0        0.0   0.23                      54   1

Row  Glucose  BloodPressure  SkinThickness  Insulin  BMI   DiabetesPedigreeFunction  Age  Outcome
758  106      76             0              0        37.5  0.20                      26   0
759  190      92             0              0        35.5  0.28                      66   1
760  88       58             26             16       28.4  0.77                      22   0
761  170      74             31             0        44.0  0.40                      43   1
762  89       62             0              0        22.5  0.14                      33   0
763  101      76             48             180      32.9  0.17                      63   0
764  122      70             27             0        36.8  0.34                      27   0
765  121      72             23             112      26.2  0.24                      30   0
766  126      60             0              0        30.1  0.35                      47   1
767  93       70             31             0        30.4  0.32                      23   0